REGULAR FINITE FUEL STOCHASTIC CONTROL PROBLEMS 

WITH EXIT TIME 

DMITRY B. ROKHLIN AND GEORGII MIRONENKO 


Abstract. We consider a class of exit time stochastic control problems for diffusion 
processes with a discounted criterion, where the controller can utilize a given amount 
of resource, called "fuel". In contrast to the vast majority of the existing literature 
on "finite fuel" problems, it is assumed that the intensity of fuel consumption 
is bounded. We characterize the value function of the optimization problem as the 
unique continuous viscosity solution of the Dirichlet boundary value problem for the 
corresponding Hamilton-Jacobi-Bellman (HJB) equation. Our assumptions concern 
the HJB equations related to the problems with infinite fuel and without fuel. We also 
present computer experiments for the problems of optimal regulation and optimal 
tracking of a simple stochastic system with a stable or unstable equilibrium point. 


1. Introduction 

Consider a controller whose aim is to keep a stochastic system X in a prescribed 
domain G as long as possible. An influence on the system requires the expenditure of a 
resource (or fuel). The problem is to utilize the given amount of resource in an optimal 
way. 

The study of diffusion stochastic control problems with a bounded amount of fuel 
was initiated in [1]. The problem was to bring a spaceship close to a stochastic 
target. Afterwards, the "finite fuel" issues were developed, almost exclusively, in the 
singular stochastic control paradigm of [HI 0]. In this framework it is always optimal 
to keep the system in a "no-action region". An essential step was made in [5]. This 
paper investigated the problem of optimal tracking of a Wiener process by a process 
whose variation is bounded by a given initial fuel amount. An explicit solution was 
obtained for a quadratic discounted criterion in the infinite horizon case. In particular, 
it was mentioned that the optimal no-action region becomes wider as the available 
fuel amount decreases. 

More general problems were then studied, e.g., in [32111^ 1^. The case of finite 
horizon was considered in [301 ISl El]. Quite general results, concerning the characterization 
of the value function and the existence of optimal control strategies, were obtained in 
HaESlESlEZI. A lot of studies were motivated by applications. We mention the 
problems of optimal correction of motion [15], controlling a satellite [28], [29], [27], reaching 
the goal by a player mi 131], optimal liquidation and trade execution [39112111II11H]. 

In contrast to the vast majority of the existing literature, we assume that the intensity 
of resource (or fuel) consumption is bounded. In fact, only [321 IH] from the above 
references adopt the same assumption. Confining ourselves to classical controls, we 
deal with simpler mathematical problems, which can still present interesting effects and 
have a wide range of applications. 

To give the precise formulation of our problem, consider a standard m-dimensional 
Wiener process $W = (W^1, \dots, W^m)$ defined on a probability space $(\Omega, \mathcal F, P)$. Denote by 
$\mathbb F = (\mathcal F_s)_{s\ge 0}$ the minimal augmented natural filtration of W. Let the controlled process 
$X = (X^1, \dots, X^d)$ be governed by the system of stochastic differential equations 

$$ dX_t = b(X_t,\alpha_t)\,dt + \sigma(X_t,\alpha_t)\,dW_t, \quad X_0 = x. \tag{1.1} $$

An $\mathbb F$-progressively measurable process $\alpha$ with values in $A = [\underline a, \overline a]$, $\underline a < 0 < \overline a$, is regarded as the 
intensity of resource consumption. We assume that the components of the drift vector 
$b: \mathbb R^d \times A \to \mathbb R^d$ and the diffusion matrix $\sigma: \mathbb R^d \times A \to \mathbb R^d \times \mathbb R^m$ are continuous, and 

$$ |b(x,a) - b(y,a)| + |\sigma(x,a) - \sigma(y,a)| \le K|x - y| $$

with some constant K independent of a. Note that this inequality also implies the 
linear growth condition: 

$$ |b(x,a)| + |\sigma(x,a)| \le K'(1 + |x|). $$

Thus, there exists a unique $\mathbb F$-adapted strong solution of (1.1) on $[0,\infty)$: see [33] (Chap. 
2, Sect. 5). The resource amount Y satisfies the equation 

$$ dY_t = -|\alpha_t|\,dt, \quad Y_0 = y. \tag{1.2} $$

We call a process $\alpha$ admissible, and write $\alpha \in \mathcal A(x,y)$, if $Y_t \ge 0$, $t \ge 0$. The 
admissibility means that the resource overrun is prohibited. The solution of (1.1), 
(1.2) is denoted by $X^{x,y,\alpha}$, $Y^{x,y,\alpha}$. 

Let $G \subset \mathbb R^d$ be an open set. It will be convenient to assume that $0 \in G$. In general, 
we do not require G to be bounded. Denote by $\theta^{x,y,\alpha} = \inf\{t \ge 0 : X_t^{x,y,\alpha} \notin G\}$ the 
exit time of $X^{x,y,\alpha}$ from G. The objective functional J and the value function v are 
defined by 

$$ J(x,y,\alpha) = E\int_0^{\theta^{x,y,\alpha}} e^{-\beta t} f(X_t^{x,y,\alpha},\alpha_t)\,dt, \quad v(x,y) = \sup_{\alpha\in\mathcal A(x,y)} J(x,y,\alpha), \tag{1.3} $$

where $\beta > 0$, and $f: \mathbb R^d \times A \to \mathbb R$ is a bounded continuous function. Note that for 
$f = 1$ we obtain the risk-sensitive criterion 

$$ J(x,y,\alpha) = \frac{1}{\beta}\left(1 - E e^{-\beta\theta^{x,y,\alpha}}\right), \tag{1.4} $$

related to the maximization of the expected time before leaving G. The 
minimization of $E e^{-\beta\theta^{x,y,\alpha}}$, as compared to the maximization of $E\theta^{x,y,\alpha}$, produces the controls 
for which the probability of an early exit is smaller (see [201 HD). 
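
The step from the functional with $f = 1$ to the criterion (1.4) is just the evaluation of the discounted integral over $[0,\theta]$. The following Python sketch checks this identity numerically and illustrates why the criterion penalizes early exits; the function names and the two sample exit-time distributions are illustrative assumptions, not part of the paper.

```python
import math

def discounted_time(theta, beta):
    """Closed form of the integral of exp(-beta*t) over [0, theta]; theta may be math.inf."""
    return (1.0 - math.exp(-beta * theta)) / beta

def discounted_time_midpoint(theta, beta, n=100_000):
    """Midpoint-rule approximation of the same integral (finite theta only)."""
    dt = theta / n
    return sum(math.exp(-beta * (i + 0.5) * dt) for i in range(n)) * dt

def criterion(exit_times, beta):
    """Sample average of (1 - exp(-beta*theta)) / beta over equally likely exit times."""
    return sum(discounted_time(t, beta) for t in exit_times) / len(exit_times)

beta = 0.1
# Two hypothetical exit-time distributions with the same mean (2.0):
# one deterministic, one with a sizeable chance of an early exit.
steady = [2.0, 2.0]
risky = [0.5, 3.5]
```

Since $\theta \mapsto 1 - e^{-\beta\theta}$ is concave, `criterion(steady, beta)` is not smaller than `criterion(risky, beta)` although both samples have the same mean exit time.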

As mentioned above, suitably formulated problems of this sort appear in various 
applications. We indicate one more example, which motivated the present work. Let X 
describe the exchange rate between a domestic and a foreign currency. The controller 
is a central bank, trying to support the national currency and, thus, to hold X above 
the level l (a topical problem for the Russian ruble). The available amount of the foreign 
currency to be sold in the market is Y. Mathematically this problem is quite similar to 
that of [28]. However, the use of regular controls (with bounded intervention intensity) 
may result in the exit of X from $G = (l,\infty)$ before the currency resource (the "fuel") Y 
is exhausted. Within our model, the central bank chooses to stop interventions at the 
occurrence of the stopping time $\theta$, regarding it as a signal to reduce the level l or to 
change the set of monetary policies. It should be noted that there is an extensive 
literature concerning the regulation of exchange rates. Without going into further discussion, 
we only mention that the prevailing paradigm is to model currency interventions by 
impulse controls: see, e.g., pSlilD] and references therein. 

The present paper is organized as follows. In Section 2 we prove that the value 
function (1.3) is the unique continuous viscosity solution of the corresponding 
Hamilton-Jacobi-Bellman (HJB) equation in the half-cylinder $G \times (0,\infty)$ with Dirichlet boundary 
conditions: see Theorem 1. The proof is based on the stochastic Perron method [6], 
adapted to exit time problems in [10], and does not appeal to the dynamic 
programming principle. Our Assumptions 1-3 are of analytical nature. They concern the 
well-posedness of the boundary value problems for the HJB equations, related to the 
problems with infinite fuel and without fuel, as well as some nice properties of the 
solution of the latter. 

Theorem 1 allows us to justify the convergence of appropriate finite difference schemes. 
In Sections 3 and 4 we present computer experiments for the problems of optimal 
regulation and optimal tracking of a simple stochastic system with a stable or unstable 
equilibrium point. 

2. Characterization of the value function 

For an open set $O \subset \mathbb R^n$ consider the differential equation 

$$ F(x,u(x),u_x(x),u_{xx}(x)) = 0, \quad x \in O, \tag{2.1} $$

where $u_x$ is the gradient vector and $u_{xx}$ is the Hessian matrix. It is assumed that the 
function $F: \overline O \times \mathbb R \times \mathbb R^n \times \mathcal S^n \to \mathbb R$ is continuous and satisfies the monotonicity property: 

$$ F(x,r,p,X) \le F(x,s,p,Y) \ \text{ whenever } r \le s \text{ and } X - Y \text{ is positive semidefinite.} $$

Here $\mathcal S^n$ is the set of symmetric $n \times n$ matrices. 

Let g be a continuous function on $\partial O$. We consider two variants of the boundary 
conditions: 

$$ u = g \ \text{ on } \partial O, \tag{2.2} $$

$$ u = g \ \text{ or } \ F(x,u,u_x,u_{xx}) = 0 \ \text{ on } \partial O. \tag{2.3} $$

The equation (2.1), as well as the boundary conditions (2.2), (2.3), should be 
understood in the viscosity sense (the classical reference is [H]). Recall that a bounded 
upper semicontinuous (usc) function u is called a viscosity subsolution of the equation 
(2.1) and the boundary condition (2.3) (resp., (2.2)) if for any $z \in \overline O$, $\varphi \in C^2(\mathbb R^n)$ 
such that z is a local maximum point of $u - \varphi$ in $\overline O$, we have 

$$ F(z,u(z),\varphi_x(z),\varphi_{xx}(z)) \le 0 \ \text{ if } z \in O, $$

$$ u(z) \le g(z) \ \text{ or } \ F(z,u(z),\varphi_x(z),\varphi_{xx}(z)) \le 0 \ \text{ if } z \in \partial O $$

(resp., $u(z) \le g(z)$ if $z \in \partial O$). 

The notion of a viscosity supersolution is defined symmetrically. One should 
consider a bounded lower semicontinuous (lsc) function u and, assuming that z is a local 
minimum point of $u - \varphi$ in $\overline O$, postulate the reverse inequalities. 

A bounded function u is called a viscosity solution of (2.1), (2.3) (or (2.1), (2.2)) if 
its usc and lsc envelopes 

$$ u^*(x) = \inf_{\varepsilon>0}\sup\{u(y) : y \in B_\varepsilon(x) \cap \overline O\}, \quad u_*(x) = \sup_{\varepsilon>0}\inf\{u(y) : y \in B_\varepsilon(x) \cap \overline O\} $$

are respectively viscosity sub- and supersolutions of these equations. Here $B_\varepsilon(x)$ is the 
open ball in $\mathbb R^n$ centered at x with radius $\varepsilon$. Following [2], we say that (2.1), (2.3) 
satisfies the strong uniqueness property if for any viscosity sub- and supersolutions u, 
w of (2.1), (2.3) we have $u \le w$ on O. 

Consider the family $\mathcal L^a$ of "infinitesimal generators" of the diffusion process X: 

$$ \mathcal L^a\varphi(x) = b(x,a)\varphi_x(x) + \frac{1}{2}\mathrm{Tr}\left(\sigma(x,a)\sigma^T(x,a)\varphi_{xx}(x)\right) $$

and the Hamilton-Jacobi-Bellman (HJB) equation, related to the problem (1.1) - (1.3): 

$$ \beta v - H(x,v_x,v_y,v_{xx}) = 0, \quad (x,y) \in \Pi := G \times (0,\infty), \tag{2.4} $$

$$ H(x,v_x,v_y,v_{xx}) = \sup_{a\in[\underline a,\overline a]}\{f(x,a) + \mathcal L^a v - |a|v_y\}. $$

The aim of this section is to prove that the value function (1.3) is the unique continuous 
viscosity solution of (2.4) with appropriate boundary conditions (see Theorem 1). This 
requires some preliminary work. 

Denote by $C^2(G)$ the set of twice continuously differentiable functions on G, 
and by $C_b(\overline G)$ the set of bounded continuous functions on $\overline G$. Put $\hat f(x) = f(x,0)$. 

Assumption 1. There exists a solution $\psi \in C_b(\overline G) \cap C^2(G)$ of the Dirichlet problem 

$$ \beta\psi(x) - \hat f(x) - \mathcal L^0\psi(x) = 0, \quad x \in G; \qquad \psi = 0 \ \text{ on } \partial G. \tag{2.5} $$

It is straightforward to show that such a solution is unique and admits the probabilistic 
representation 

$$ \psi(x) = E\int_0^{\theta^x} e^{-\beta t}\hat f(X_t^x)\,dt, \quad \theta^x = \inf\{t \ge 0 : X_t^x \notin G\}, $$

$$ dX_t^x = b(X_t^x,0)\,dt + \sigma(X_t^x,0)\,dW_t, \quad X_0^x = x $$

(see [23], Chap. H, Theorem 2.1 and Remark 1 after it). Note that $\psi$ is the value 
function of the problem without fuel. 

One can see that (1.1) - (1.3) combines the features of exit time and state constrained 
control problems. Fortunately, it is equivalent to a pure exit time problem. Let 
$\tau^{x,y,\alpha}$ be the exit time of $(X^{x,y,\alpha}, Y^{x,y,\alpha})$ from the open set $G \times (0,\infty)$. For stopping 
times $\tau \le \sigma$ with values in $[0,\infty]$ we denote by $[\![\tau,\sigma]\!]$ the stochastic interval 
$\{(\omega,t) \in \Omega \times [0,\infty) : \tau(\omega) \le t \le \sigma(\omega)\}$. 
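
Lemma 1. Under Assumption 1 the value function admits the representation 

$$ v(x,y) = \sup_{\alpha\in\mathcal U} E\left(\int_0^{\tau^{x,y,\alpha}} e^{-\beta t} f(X_t^{x,y,\alpha},\alpha_t)\,dt + e^{-\beta\tau^{x,y,\alpha}}\psi\left(X^{x,y,\alpha}_{\tau^{x,y,\alpha}}\right)\right), \tag{2.6} $$

where $\mathcal U$ is the set of all $\mathbb F$-progressively measurable strategies $\alpha$ with values in A. 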

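Proof. Note that $\tau^{x,y,\alpha} = \theta^{x,y,\alpha} \wedge \inf\{t \ge 0 : Y_t^{x,y,\alpha} = 0\}$, and any admissible strategy 
$\alpha \in \mathcal A(x,y)$ should be switched to 0 as soon as "the fuel is exhausted": $\alpha_t = 0$, 
$t \in [\tau^{x,y,\alpha}, \theta^{x,y,\alpha}]$. Hence, the functional (1.3) can be represented as 

$$ J(x,y,\alpha) = E\int_0^{\tau^{x,y,\alpha}} e^{-\beta t} f(X_t^{x,y,\alpha},\alpha_t)\,dt + E\left(I_{\{\tau^{x,y,\alpha}<\theta^{x,y,\alpha}\}}\int_{\tau^{x,y,\alpha}}^{\theta^{x,y,\alpha}} e^{-\beta t}\hat f(X_t^{x,y,\alpha})\,dt\right). \tag{2.7} $$

On the stochastic interval $[\![\tau^{x,y,\alpha},\theta^{x,y,\alpha}]\!]$ we have 

$$ X_t^{x,y,\alpha} = \xi + \int_{\tau^{x,y,\alpha}}^t b(X_s^{x,y,\alpha},0)\,ds + \int_{\tau^{x,y,\alpha}}^t \sigma(X_s^{x,y,\alpha},0)\,dW_s, \quad \xi = X^{x,y,\alpha}_{\tau^{x,y,\alpha}}. $$

For the solution $\psi$ of the Dirichlet problem (2.5), by Ito's formula we get 

$$ e^{-\beta t}\psi(X_t^{x,y,\alpha}) = e^{-\beta\tau^{x,y,\alpha}}\psi(\xi) + \int_{\tau^{x,y,\alpha}}^t e^{-\beta s}\left(\mathcal L^0\psi - \beta\psi\right)(X_s^{x,y,\alpha})\,ds + M_t $$

$$ = e^{-\beta\tau^{x,y,\alpha}}\psi(\xi) - \int_{\tau^{x,y,\alpha}}^t e^{-\beta s}\hat f(X_s^{x,y,\alpha})\,ds + M_t \ \text{ on } [\![\tau^{x,y,\alpha},\theta^{x,y,\alpha}]\!], \tag{2.8} $$

where $M_t = \int_{\tau^{x,y,\alpha}}^t e^{-\beta s}\psi_x(X_s^{x,y,\alpha}) \cdot \sigma(X_s^{x,y,\alpha},0)\,dW_s$ is a local martingale. The last 
equality shows, however, that M is bounded. Hence, M is a uniformly integrable 
continuous martingale with $M_{\tau^{x,y,\alpha}} = 0$ on $\{\tau^{x,y,\alpha} < \theta^{x,y,\alpha}\}$. For any stopping time $\tau$ 
such that 

$$ \tau^{x,y,\alpha} \le \tau < \theta^{x,y,\alpha} \ \text{ on } \{\tau^{x,y,\alpha} < \theta^{x,y,\alpha}\}, \tag{2.9} $$

by taking the conditional expectation, from (2.8) we obtain 

$$ I_{\{\tau^{x,y,\alpha}<\theta^{x,y,\alpha}\}}\, e^{-\beta\tau^{x,y,\alpha}}\psi(\xi) = I_{\{\tau^{x,y,\alpha}<\theta^{x,y,\alpha}\}}\, E\left(\int_{\tau^{x,y,\alpha}}^{\tau} e^{-\beta s}\hat f(X_s^{x,y,\alpha})\,ds + e^{-\beta\tau}\psi(X_\tau^{x,y,\alpha}) \,\Big|\, \mathcal F_{\tau^{x,y,\alpha}}\right). \tag{2.10} $$

Consider an expanding sequence of compact sets $G_n$ such that $\cup_{n\ge 1} G_n = G$ and put 

$$ \tau_n = \tau^{x,y,\alpha} \vee \inf\{t \ge 0 : X_t^{x,y,\alpha} \notin G_n\}. $$

Clearly, $\tau_n \uparrow \theta^{x,y,\alpha}$ and $\tau_n$ satisfy the condition (2.9) imposed on $\tau$. Furthermore, 

$$ \lim_{n\to\infty} e^{-\beta\tau_n}\psi(X_{\tau_n}^{x,y,\alpha})\, I_{\{\tau^{x,y,\alpha}<\theta^{x,y,\alpha}\}} = 0 \ \text{ a.s.} $$

by the boundary condition (2.5) and the boundedness of $\psi$. By the inequality 

$$ E\left|E\left(e^{-\beta\tau_n}\psi(X_{\tau_n}^{x,y,\alpha})\, I_{\{\tau^{x,y,\alpha}<\theta^{x,y,\alpha}\}} \,\Big|\, \mathcal F_{\tau^{x,y,\alpha}}\right)\right| \le E\left|e^{-\beta\tau_n}\psi(X_{\tau_n}^{x,y,\alpha})\right| I_{\{\tau^{x,y,\alpha}<\theta^{x,y,\alpha}\}} $$

and the dominated convergence theorem it follows that 

$$ E\left(e^{-\beta\tau_n}\psi(X_{\tau_n}^{x,y,\alpha})\, I_{\{\tau^{x,y,\alpha}<\theta^{x,y,\alpha}\}} \,\Big|\, \mathcal F_{\tau^{x,y,\alpha}}\right) \to 0 \ \text{ in } L^1. $$

Passing to a subsequence, one may assume that this sequence converges with probability 
1. Then from (2.10) we get 

$$ I_{\{\tau^{x,y,\alpha}<\theta^{x,y,\alpha}\}}\, E\left(\int_{\tau^{x,y,\alpha}}^{\theta^{x,y,\alpha}} e^{-\beta s}\hat f(X_s^{x,y,\alpha})\,ds \,\Big|\, \mathcal F_{\tau^{x,y,\alpha}}\right) = I_{\{\tau^{x,y,\alpha}<\theta^{x,y,\alpha}\}}\, e^{-\beta\tau^{x,y,\alpha}}\psi(\xi). $$

This completes the proof, since (2.7) takes the form 

$$ J(x,y,\alpha) = E\left(\int_0^{\tau^{x,y,\alpha}} e^{-\beta t} f(X_t^{x,y,\alpha},\alpha_t)\,dt + I_{\{\tau^{x,y,\alpha}<\theta^{x,y,\alpha}\}}\, e^{-\beta\tau^{x,y,\alpha}}\psi\left(X^{x,y,\alpha}_{\tau^{x,y,\alpha}}\right)\right), $$

which is equivalent to the representation (2.6) in view of the boundary condition (2.5). 

□ 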


Denote by $\overline v$ the value function of the problem with "infinite fuel": 

$$ \overline v(x) = \sup_{\alpha\in\mathcal U} E\int_0^{\theta^{x,\alpha}} e^{-\beta t} f(X_t^{x,\alpha},\alpha_t)\,dt, \quad \theta^{x,\alpha} = \inf\{t \ge 0 : X_t^{x,\alpha} \notin G\}, \tag{2.11} $$

where $X^{x,\alpha}$ is the solution of (1.1). Consider the corresponding HJB equation and the 
boundary conditions: 

$$ \beta\overline v - \sup_{a\in[\underline a,\overline a]}\{f(x,a) + \mathcal L^a\overline v\} = 0, \quad x \in G, \tag{2.12} $$

$$ \beta\overline v - \sup_{a\in[\underline a,\overline a]}\{f(x,a) + \mathcal L^a\overline v\} = 0 \ \text{ or } \ \overline v = 0 \ \text{ on } \partial G. \tag{2.13} $$

Assumption 2. The boundary value problem (2.12), (2.13) satisfies the strong 
uniqueness property. 

Let us mention a simple sufficient condition ensuring the validity of Assumption 2. 
Suppose that $\partial G$ is of class $C^2$ and denote by n(x) the unit outer normal to $\partial G$ at 
x. Assumption 2 holds true if the diffusion matrix does not degenerate along the 
normal direction to the boundary: 

$$ \sigma(x,a)n(x) \ne 0, \quad (x,a) \in \partial G \times A. \tag{2.14} $$

Indeed, let u, w be bounded viscosity sub- and supersolutions of (2.12), (2.13). By 
Proposition 4.1 of pQ, the generalized Dirichlet boundary condition (2.13) is satisfied 
in the usual sense: $u \le 0 \le w$ on $\partial G$. Thus, we can apply the comparison result 
[26, Theorem 7.3], [36, Theorem 4.2] (for a not necessarily bounded domain G) to get the 
inequality $u \le w$ on $\overline G$. 

Assumption 3. There exists a constant $K > 0$ such that 

$$ \sup_{x\in G}\left|f(x,a) - f(x,0) + \mathcal L^a\psi(x) - \mathcal L^0\psi(x)\right| \le K|a|. $$

This assumption is satisfied, e.g., if f is Lipschitz continuous, $\mathcal L^0$ is strictly elliptic and 
G is a bounded domain of Hölder class $C^{2,\alpha}$, $\alpha > 0$. Indeed, by the classical Schauder 
theory we have $\psi \in C^{2,\alpha}(\overline G)$ (see [25], Theorem 6.14) and, in particular, the derivatives 
of $\psi$ up to second order are uniformly bounded in G. Note that Assumption 1 also 
holds true in this case. 

Define a continuous function g on $\partial\Pi$ by the formulas 

$$ g(x,0) = \psi(x), \ x \in \overline G; \qquad g(x,y) = 0, \ x \in \partial G, \ y \ge 0. \tag{2.15} $$

Theorem 1. Under Assumptions 1-3 the value function v is the unique bounded viscosity 
solution of (2.4) which is continuous on $\overline\Pi$ and satisfies the boundary condition 

$$ v = g \ \text{ on } \partial\Pi. \tag{2.16} $$


A specific feature of the equation (2.4) is the form in which it degenerates at the 
boundary points (x,0). Such degeneracy does not allow one to apply directly the strong 
comparison result of [H Theorem 2.1], [14, Theorem 2.1]. It is possible to apply the 
results of [36] after some additional work. We, however, pursue another way, utilizing 
the stochastic Perron method developed in [6]. Using the result of [40], this allows us to 
give a short direct proof of Theorem 1 without relying on the dynamic programming 
principle. 

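Let $\tau$ be a stopping time, and let $(\xi,\eta)$ be a bounded $\mathcal F_\tau$-measurable random vector 
with values in $\overline\Pi$. Consider the SDE (1.1), (1.2) with the randomized initial condition $(\tau,\xi,\eta)$: 

$$ X_t = \xi I_{\{t\ge\tau\}} + \int_\tau^{\tau\vee t} b(X_s,\alpha_s)\,ds + \int_\tau^{\tau\vee t} \sigma(X_s,\alpha_s)\,dW_s, \tag{2.17} $$

$$ Y_t = \eta I_{\{t\ge\tau\}} - \int_\tau^{\tau\vee t} |\alpha_s|\,ds. \tag{2.18} $$

As is known, see [43] (Chap. 2, Sect. 5), there exists a pathwise unique strong solution 
$(X^{\tau,\xi,\eta,\alpha}, Y^{\tau,\xi,\eta,\alpha})$ of (2.17), (2.18) for any $\alpha \in \mathcal U$. Denote by $\tau^{\tau,\xi,\eta,\alpha}$ the exit time 
of this solution from $\Pi$ after $\tau$. To reconcile this notation with the 
previous one we drop the index $\tau$ for $\tau = 0$: $X^{0,x,y,\alpha} = X^{x,y,\alpha}$, for instance. 

For a continuous function u on $\overline\Pi$ define the process 

$$ Z_t^{\tau,\xi,\eta,\alpha}(u) = \int_\tau^t e^{-\beta s} f(X_s^{\tau,\xi,\eta,\alpha},\alpha_s)\,ds + e^{-\beta t} u\left(X_t^{\tau,\xi,\eta,\alpha}, Y_t^{\tau,\xi,\eta,\alpha}\right), \quad t \ge \tau. $$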

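Definition 1. A function $u \in C(\overline\Pi)$, such that $u(\cdot,y)$ is bounded, is called a stochastic 
subsolution of (2.4), (2.16) if $u \le g$ on $\partial\Pi$ and for any randomized initial condition 
$(\tau,\xi,\eta)$ there exists $\alpha \in \mathcal U$ such that 

$$ E\left(Z_\rho^{\tau,\xi,\eta,\alpha}(u) \,|\, \mathcal F_\tau\right) \ge Z_\tau^{\tau,\xi,\eta,\alpha}(u) = e^{-\beta\tau}u(\xi,\eta) $$

for any stopping time $\rho \in [\tau, \tau^{\tau,\xi,\eta,\alpha}]$. 

Definition 2. A function $w \in C(\overline\Pi)$, such that $w(\cdot,y)$ is bounded, is called a stochastic 
supersolution of (2.4), (2.16) if $w \ge g$ on $\partial\Pi$ and 

$$ E\left(Z_\rho^{\tau,\xi,\eta,\alpha}(w) \,|\, \mathcal F_\tau\right) \le Z_\tau^{\tau,\xi,\eta,\alpha}(w) = e^{-\beta\tau}w(\xi,\eta) $$

for any randomized initial condition $(\tau,\xi,\eta)$, control process $\alpha \in \mathcal U$ and stopping time 
$\rho \in [\tau, \tau^{\tau,\xi,\eta,\alpha}]$. 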

In this form the notions of stochastic semisolutions are tailor-made for exit time 
problems (see [10]). In the present context we need not assume that u, w are bounded in 
y, since for any bounded initial condition $(\xi,\eta)$ the process $Y^{\tau,\xi,\eta,\alpha}$ remains bounded up 
to the exit time from $\Pi$. Note that may be defined arbitrary. 

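Denote the sets of stochastic sub- and supersolutions by $\mathcal V_-$ and $\mathcal V_+$ respectively. Any 
stochastic subsolution u bounds the value function v from below and any stochastic 
supersolution w bounds v from above: 

$$ u_- := \sup_{u\in\mathcal V_-} u \le v \le w_+ := \inf_{w\in\mathcal V_+} w \ \text{ on } \overline\Pi. \tag{2.19} $$

To see this, put $\tau = 0$, $\xi = x$, $\eta = y$ and $\rho = \tau^{x,y,\alpha}$ in Definition 1: 

$$ u(x,y) = Z_0^{x,y,\alpha}(u) \le E Z^{x,y,\alpha}_{\tau^{x,y,\alpha}}(u) = E\int_0^{\tau^{x,y,\alpha}} e^{-\beta s} f(X_s^{x,y,\alpha},\alpha_s)\,ds + E e^{-\beta\tau^{x,y,\alpha}} u\left(X^{x,y,\alpha}_{\tau^{x,y,\alpha}}, Y^{x,y,\alpha}_{\tau^{x,y,\alpha}}\right) \le v(x,y). $$

The last inequality follows from the representation (2.6), since 

$$ e^{-\beta\tau^{x,y,\alpha}} u\left(X^{x,y,\alpha}_{\tau^{x,y,\alpha}}, Y^{x,y,\alpha}_{\tau^{x,y,\alpha}}\right) \le e^{-\beta\tau^{x,y,\alpha}} g\left(X^{x,y,\alpha}_{\tau^{x,y,\alpha}}, Y^{x,y,\alpha}_{\tau^{x,y,\alpha}}\right) = e^{-\beta\tau^{x,y,\alpha}} \psi\left(X^{x,y,\alpha}_{\tau^{x,y,\alpha}}\right). $$

Similarly, by Definition 2 we have 

$$ w(x,y) = Z_0^{x,y,\alpha}(w) \ge E Z^{x,y,\alpha}_{\tau^{x,y,\alpha}}(w) = E\int_0^{\tau^{x,y,\alpha}} e^{-\beta s} f(X_s^{x,y,\alpha},\alpha_s)\,ds + E e^{-\beta\tau^{x,y,\alpha}} w\left(X^{x,y,\alpha}_{\tau^{x,y,\alpha}}, Y^{x,y,\alpha}_{\tau^{x,y,\alpha}}\right) $$

for all $\alpha \in \mathcal U$. Thus $w(x,y) \ge v(x,y)$, since 

$$ e^{-\beta\tau^{x,y,\alpha}} w\left(X^{x,y,\alpha}_{\tau^{x,y,\alpha}}, Y^{x,y,\alpha}_{\tau^{x,y,\alpha}}\right) \ge e^{-\beta\tau^{x,y,\alpha}} g\left(X^{x,y,\alpha}_{\tau^{x,y,\alpha}}, Y^{x,y,\alpha}_{\tau^{x,y,\alpha}}\right) = e^{-\beta\tau^{x,y,\alpha}} \psi\left(X^{x,y,\alpha}_{\tau^{x,y,\alpha}}\right). $$
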

Lemma 2. Put $u(x,y) = \psi(x)$, $w(x,y) = \psi(x) + cy$. Under Assumptions 1, 3 we have 
$u \in \mathcal V_-$, $w \in \mathcal V_+$ for $c > 0$ large enough. Moreover, $\overline v$, defined by (2.11), is a stochastic 
supersolution under Assumption 2. 


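Proof. (i) Similar to (2.8), by Ito's formula and the definition of $\psi$ we get 

$$ Z_t^{\tau,\xi,\eta,0}(u) = e^{-\beta\tau}\psi(\xi) + \int_\tau^t e^{-\beta s}\left(\hat f + \mathcal L^0\psi - \beta\psi\right)(X_s^{\tau,\xi,\eta,0})\,ds + M_t = e^{-\beta\tau}\psi(\xi) + M_t \tag{2.20} $$

on $[\![\tau,\tau^{\tau,\xi,\eta,0}]\!]$, where $M_t = \int_\tau^t e^{-\beta s}\psi_x(X_s^{\tau,\xi,\eta,0}) \cdot \sigma(X_s^{\tau,\xi,\eta,0},0)\,dW_s$ is a local martingale. 
Since $M_\tau = 0$ on $\{\tau < \tau^{\tau,\xi,\eta,0}\}$ and M is bounded, as follows from (2.20), we have 

$$ E\left(Z_\rho^{\tau,\xi,\eta,0}(u) \,|\, \mathcal F_\tau\right) = Z_\tau^{\tau,\xi,\eta,0}(u) $$

for a stopping time $\rho$ satisfying the inequality 

$$ \tau \le \rho < \tau^{\tau,\xi,\eta,0} \ \text{ on } \{\tau < \tau^{\tau,\xi,\eta,0}\}. \tag{2.21} $$

Moreover, since any $\mathbb F$-stopping time is predictable (see [SI Proposition 16.22]), we may 
pass from (2.21) to an arbitrary stopping time $\rho \le \tau^{\tau,\xi,\eta,0}$ by the continuity argument. 

It follows that u is a stochastic subsolution: 

$$ E\left(Z_\rho^{\tau,\xi,\eta,0}(u) \,|\, \mathcal F_\tau\right) = Z_\tau^{\tau,\xi,\eta,0}(u) = e^{-\beta\tau}\psi(\xi), \quad \rho \in [\tau,\tau^{\tau,\xi,\eta,0}], \tag{2.22} $$

since on $\{\tau = \tau^{\tau,\xi,\eta,0}\}$ this equality is trivially satisfied. Note that the control process 
$\alpha = 0$, ensuring (2.22), does not depend on $(\tau,\xi,\eta)$, although such dependence is 
allowed by Definition 1. 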

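(ii) To prove that $w = \psi(x) + cy$ is a stochastic supersolution of (2.4), (2.16), we also 
apply Ito's formula: 

$$ Z_t^{\tau,\xi,\eta,\alpha}(w) = e^{-\beta\tau}w(\xi,\eta) + N_t + M_t \ \text{ on } [\![\tau,\tau^{\tau,\xi,\eta,\alpha}]\!], $$

$$ N_t = \int_\tau^t e^{-\beta s}\left(f(X_s,\alpha_s) + \mathcal L^{\alpha_s}\psi(X_s) - |\alpha_s|c - \beta\psi(X_s) - \beta c Y_s\right)ds, \quad M_t = \int_\tau^t e^{-\beta s}\psi_x(X_s) \cdot \sigma(X_s,\alpha_s)\,dW_s, $$

where $X = X^{\tau,\xi,\eta,\alpha}$, $Y = Y^{\tau,\xi,\eta,\alpha}$. By Assumption 3 there exists a constant $K > 0$ such that 

$$ |f(x,a) + \mathcal L^a\psi(x) - \beta\psi(x)| = |f(x,a) - f(x,0) + \mathcal L^a\psi(x) - \mathcal L^0\psi(x)| \le K|a|. $$

Hence, 

$$ |N_t| \le \int_\tau^t e^{-\beta s}\left(K|\alpha_s| + c|\alpha_s| + \beta c Y_s\right)ds \le K', $$

$$ N_t \le \int_\tau^t e^{-\beta s}(K - c)|\alpha_s|\,ds \le 0 \ \text{ for } c \ge K. $$

It follows that the local martingale M is uniformly bounded on $[\![\tau,\tau^{\tau,\xi,\eta,\alpha}]\!]$, and 

$$ E\left(Z_\rho^{\tau,\xi,\eta,\alpha}(w) \,|\, \mathcal F_\tau\right) \le Z_\tau^{\tau,\xi,\eta,\alpha}(w) $$

for a stopping time $\rho$ satisfying (2.21) with $\tau^{\tau,\xi,\eta,\alpha}$ in place of $\tau^{\tau,\xi,\eta,0}$. As in the proof of part (i), one can extend this 
inequality to a stopping time $\rho \le \tau^{\tau,\xi,\eta,\alpha}$ to obtain 

$$ E\left(Z_\rho^{\tau,\xi,\eta,\alpha}(w) \,|\, \mathcal F_\tau\right) \le Z_\tau^{\tau,\xi,\eta,\alpha}(w) = e^{-\beta\tau}w(\xi,\eta), \quad \rho \in [\tau,\tau^{\tau,\xi,\eta,\alpha}], $$

which means that w is a stochastic supersolution. 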

(iii) By [10] (see Theorem 1 and Remark 1) $\overline v \in C(\overline G)$ is the unique bounded 
viscosity solution of (2.12), (2.13), and it satisfies the boundary condition $\overline v = 0$ on $\partial G$. 
The argumentation of [10] shows that there exists a decreasing sequence of stochastic 
supersolutions $w_n$ of (2.12), (2.13), converging to $\overline v$. From Definition 2 it follows that 
$\overline v$ is a stochastic supersolution of (2.12), (2.13). Since $\overline v$ does not depend on y and 
$\overline v \ge \psi$ (and thus $\overline v \ge g$ on $\partial\Pi$), from the same definition it follows that $\overline v$ is a stochastic 
supersolution of (2.4), (2.16). □ 


Proof of Theorem 1. By Lemma 2 there exists a pair u, w of stochastic sub- and 
supersolutions such that 

$$ u = \psi = w \ \text{ on } \overline G \times \{0\}, $$

and there exists another such pair u, $\overline v$ such that 

$$ u = 0 = \overline v \ \text{ on } \partial G \times [0,\infty). $$

By the definition (2.19) of $u_-$, $w_+$ it follows that 

$$ u_- = w_+ = g \ \text{ on } \partial\Pi. \tag{2.23} $$

Furthermore, it was proved in [10] (Theorems 2, 3) that $u_-$ (resp., $w_+$) is a viscosity 
supersolution (resp., viscosity subsolution) of (2.4). By the comparison result (see [26, 
Theorem 7.3], [36, Theorem 4.2], [13, Theorem 6.21] for the case of an unbounded domain) 
and (2.23) we get the inequality $u_- \ge w_+$ on $\overline\Pi$. Combining this inequality with (2.19), 
we conclude that 

$$ u_- = v = w_+ \ \text{ on } \overline\Pi. $$

Hence, v is a viscosity solution of (2.4) which is continuous on $\overline\Pi$ and satisfies the boundary 
condition (2.16) in the usual sense. The uniqueness of a continuous viscosity solution 
follows from the same comparison result. □ 

Remark 1. Theorem 1 can be adapted to the case of a finite horizon T. In this case 
one should consider the extended controlled process $\hat X = (t,X)$ and the domain $\hat G = 
(0,T) \times G$ instead of X and G. The HJB equation in $\hat G \times (0,\infty)$ is analysed along the 
same lines. The corresponding parabolic equations (2.5), (2.12) can be considered as 
degenerate elliptic: see, e.g., [M] for the linear case, and [HI Corollary 3.1] for the strong 
comparison result in the nonlinear case. The latter is needed to make Assumption 2 
constructive. 

Remark 2. The Dirichlet condition $v(x,0) = \psi(x)$, $x \in \overline G$, where $\psi$ is the value 
function of the uncontrolled problem without fuel, is quite natural and is commonly 
used in finite fuel control problems (see, e.g., [22, Sect. VIII.6]). However, some authors 
use another condition, which is typical for state constrained problems: see [39], [35]. 

Remark 3. The idea of utilizing stochastic semisolutions that satisfy the sought-for 
boundary conditions (see Lemma 2), in order to reduce the proof of Theorem 1 to 
a standard comparison result, is borrowed from [7]. 

3. Optimal correction of a stochastic system 

Consider a simple one-dimensional (d = 1) controlled stochastic system 

$$ dX_t = -kX_t\,dt + \sigma\,dW_t - \alpha_t\,dt, $$
$$ dY_t = -|\alpha_t|\,dt, $$

where $\sigma > 0$, k are some constants and $\alpha_t \in [\underline a, \overline a]$. The case $k > 0$ (resp., $k < 
0$) corresponds to the stable (resp., unstable) equilibrium point 0. An infinitesimal 
increment dX of the system can be corrected with intensity $\alpha$. The controller's aim is to 
keep the system in the interval $G = (-l,l)$, $l > 0$, as long as possible. More precisely, 
we consider the risk-sensitive criterion of the form (1.4). By Lemma 1 we pass to the 
exit time problem 

$$ v(x,y) = \sup_{\alpha\in\mathcal U} E\left(\int_0^{\tau^{x,y,\alpha}} e^{-\beta t}\,dt + e^{-\beta\tau^{x,y,\alpha}}\psi\left(X^{x,y,\alpha}_{\tau^{x,y,\alpha}}\right)\right), $$

where $\psi$ is the solution of the Dirichlet problem for the ordinary differential equation: 

$$ \beta\psi - 1 + kx\psi_x - \frac{\sigma^2}{2}\psi_{xx} = 0, \quad x \in (-l,l), $$

$$ \psi(-l) = \psi(l) = 0. $$

Clearly, Assumptions 1-3 hold true. By Theorem 1, v is the unique continuous 
bounded viscosity solution of the equation 

$$ \beta v - 1 - \frac{\sigma^2}{2}v_{xx} + \min_{a\in[\underline a,\overline a]}\{(kx + a)v_x + |a|v_y\} = 0, \quad (x,y) \in (-l,l) \times (0,\infty), \tag{3.1} $$

which satisfies the boundary conditions: 

$$ v(x,0) = \psi(x), \ x \in [-l,l]; \qquad v(-l,y) = v(l,y) = 0, \ y > 0. \tag{3.2} $$

To solve the problem (3.1), (3.2) numerically we consider the rectangular grid 

$$ \overline G_h = \{x_{ij} = (ih_1, jh_2) : -I \le i \le I, \ 0 \le j \le J\}, \quad Ih_1 = l, \ Jh_2 = \overline y, $$

where I, J are integers, $h = (h_1,h_2)$ are the grid steps, and $\overline y$ corresponds to 
the artificial boundary. Put $G_h = \{x_{ij} : -I < i < I, \ 0 < j \le J\}$ and denote by 
$\partial G_h = \overline G_h \setminus G_h$ the "parabolic boundary" of the grid. Consider the system of equations 




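$$ \beta v_{ij} - 1 - \frac{\sigma^2}{2}\,\frac{v_{i+1,j} - 2v_{ij} + v_{i-1,j}}{h_1^2} + \min_{a\in[\underline a,\overline a]}\left\{(kx_{ij} + a)^+\frac{v_{ij} - v_{i-1,j}}{h_1} + (kx_{ij} + a)^-\frac{v_{ij} - v_{i+1,j}}{h_1} + |a|\frac{v_{ij} - v_{i,j-1}}{h_2}\right\} = 0, \quad x_{ij} \in G_h, \tag{3.3} $$

$$ v_{ij} - g(x_{ij}) = 0, \quad x_{ij} \in \partial G_h \tag{3.4} $$

for the mesh function $v_{ij} = v_h(x_{ij})$. Here $kx_{ij}$ stands for k times the first coordinate 
of $x_{ij}$, and $c^+ = \max\{c,0\}$, $c^- = \max\{-c,0\}$. The function g is defined by (2.15). Equations 
(3.3), (3.4) can be represented in the form 

$$ F_h\left(x_{ij}, v_{ij}, (v_{ij} - v_{i'j'})_{x_{i'j'}\in\Gamma(x_{ij})}\right) = 0, \quad x_{ij} \in \overline G_h, \tag{3.5} $$

where $\Gamma(x_{ij})$ is the set of neighbors of $x_{ij}$: 

$$ \Gamma(x_{ij}) = \{x_{i+1,j}, x_{i-1,j}, x_{i,j-1}\}, \ x_{ij} \in G_h; \qquad \Gamma(x_{ij}) = \emptyset, \ x_{ij} \in \partial G_h. $$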

The function $F_h$ is nondecreasing in each variable, except for $x_{ij}$. In the terminology 
of [38] the scheme (3.5) is degenerate elliptic. The inequality 

$$ F_h(x,r,y) - F_h(x,r',y) \ge \beta'(r - r') > 0, \quad \beta' = \min\{\beta,1\}, \ \text{ for } r > r', $$

means that the scheme is proper. Furthermore, the scheme is Lipschitz continuous 
with constant $K_h$: 

$$ \left|F_h[x,z] - F_h[x,z']\right| \le K_h\|z - z'\|_\infty, \tag{3.6} $$

$$ K_h = \max\{1,\beta\} + \frac{2\sigma^2}{h_1^2} + \frac{2(|k|l + \overline a)}{h_1} + \frac{2\overline a}{h_2}. $$


We use the notation $F_h[x,v_h]$ for the left-hand side of (3.5), and $\|z\|_\infty = \max\{|z_{ij}| : 
x_{ij} \in \overline G_h\}$. The proof of (3.6) is based on the elementary inequality 

$$ \left|\max_{q\in Q}\phi(x,q) - \max_{q\in Q}\phi(y,q)\right| \le \max_{q\in Q}|\phi(x,q) - \phi(y,q)|, $$

which is valid for a continuous function $\phi$ and a compact set Q. 
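
The elementary inequality for maxima is easy to test numerically on a finite set Q. In the Python sketch below each array plays the role of $q \mapsto \phi(x,q)$ or $q \mapsto \phi(y,q)$ sampled at finitely many points q; the helper name `max_gap` and the random test data are illustrative assumptions.

```python
import numpy as np

def max_gap(phi_x, phi_y):
    """Return (|max phi_x - max phi_y|, max |phi_x - phi_y|) for two arrays indexed by q."""
    lhs = abs(phi_x.max() - phi_y.max())
    rhs = np.abs(phi_x - phi_y).max()
    return lhs, rhs

rng = np.random.default_rng(0)
# Each row of `pairs` holds one randomly generated (lhs, rhs) pair.
pairs = np.array([max_gap(rng.normal(size=50), rng.normal(size=50)) for _ in range(1000)])
```

In every sampled case the first column of `pairs` does not exceed the second one, in line with the inequality. The same bound with min in place of max follows by applying it to $-\phi$.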

In [38] (Theorem 7) it is shown that the operator $S_\rho(v) = v - \rho F_h[x,v]$ is a strict 
contraction in the space of mesh functions, equipped with the norm $\|\cdot\|_\infty$, for sufficiently 
small $\rho$: 

$$ \|S_\rho(u) - S_\rho(v)\|_\infty \le \gamma\|u - v\|_\infty, \quad \gamma = \max\{1 - \rho\beta', \rho K_h\}. $$

It follows that (3.5) has a unique solution $v_h$, which coincides with the fixed point of 
$S_\rho$, and it can be approximated by the iterations 

$$ v^{n+1} = S_\rho(v^n) \tag{3.7} $$

with an arbitrary $v^0$. 
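
To make the iteration (3.7) concrete, here is a minimal Python sketch of the whole procedure on a deliberately coarse grid. It is an illustration under stated assumptions rather than the authors' implementation: the drift terms are discretized by upwind differences, the boundary data $\psi$ is computed by the same upwind scheme, the control set is taken symmetric ($\underline a = -\overline a$), and the step is $\rho = 1/(2K_h)$ with a Lipschitz bound $K_h$ derived for this particular discretization. All function names and default parameters are hypothetical.

```python
import numpy as np

def solve_psi(x, beta, sigma, k):
    """Upwind finite-difference solution of the no-fuel Dirichlet problem for psi."""
    n, h = len(x), x[1] - x[0]
    A, rhs = np.zeros((n, n)), np.ones(n)
    A[0, 0] = A[-1, -1] = 1.0
    rhs[0] = rhs[-1] = 0.0
    d = sigma**2 / (2 * h**2)
    for i in range(1, n - 1):
        c = k * x[i]
        A[i, i - 1] = -d - max(c, 0.0) / h
        A[i, i] = beta + 2 * d + abs(c) / h
        A[i, i + 1] = -d - max(-c, 0.0) / h
    return np.linalg.solve(A, rhs)

def F_h(v, x, psi, beta, sigma, k, a_bar, h1, h2):
    """Scheme operator: v - g on the parabolic boundary, upwind HJB residual inside."""
    F = np.empty_like(v)
    F[:, 0] = v[:, 0] - psi                  # y = 0: g = psi
    F[0, :], F[-1, :] = v[0, :], v[-1, :]    # x = -l and x = l: g = 0
    c = v[1:-1, 1:]
    back_x = (c - v[:-2, 1:]) / h1
    fwd_x = (v[2:, 1:] - c) / h1
    back_y = (c - v[1:-1, :-1]) / h2
    d2 = (v[2:, 1:] - 2 * c + v[:-2, 1:]) / h1**2
    kx = (k * x[1:-1])[:, None]
    # The bracket is piecewise linear in a, so its minimum over [-a_bar, a_bar]
    # is attained at an endpoint or at a kink (a = 0 or a = -k*x).
    candidates = [-a_bar + 0 * kx, 0 * kx, a_bar + 0 * kx, np.clip(-kx, -a_bar, a_bar)]
    ham = np.min([np.maximum(kx + a, 0) * back_x - np.maximum(-(kx + a), 0) * fwd_x
                  + np.abs(a) * back_y for a in candidates], axis=0)
    F[1:-1, 1:] = beta * c - 1.0 - 0.5 * sigma**2 * d2 + ham
    return F

def solve_hjb(I=5, J=5, l=1.0, y_max=2.0, beta=0.1, sigma=0.8, k=1.0, a_bar=2.0,
              tol=1e-9, max_iter=200_000):
    """Iterate v <- v - rho * F_h[v] from v = 0 until the residual is below tol."""
    x = np.linspace(-l, l, 2 * I + 1)
    h1, h2 = l / I, y_max / J
    psi = solve_psi(x, beta, sigma, k)
    K_h = (max(1.0, beta) + 2 * sigma**2 / h1**2
           + 2 * (abs(k) * l + a_bar) / h1 + 2 * a_bar / h2)
    rho = 1.0 / (2.0 * K_h)
    v = np.zeros((2 * I + 1, J + 1))
    for _ in range(max_iter):
        F = F_h(v, x, psi, beta, sigma, k, a_bar, h1, h2)
        if np.max(np.abs(F)) < tol:
            break
        v = v - rho * F
    residual = np.max(np.abs(F_h(v, x, psi, beta, sigma, k, a_bar, h1, h2)))
    return x, psi, v, residual

x, psi, v, residual = solve_hjb()
```

Rows of `v` are indexed by the state x and columns by the fuel level y. Starting from `v = 0` corresponds to the lower of the two monotone initial approximations; the small default grid keeps the run time short and should be refined for any quantitative use.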

Theorem 1 allows us to justify the convergence of the mesh functions $v_h$, $h \to 0$, to the value 
function v by the Barles-Souganidis method [2, Theorem 2.1]. To do this one should 
check the stability, monotonicity and consistency properties of the finite difference 
scheme (3.5) (see [2] for the definitions). 

The monotonicity property means that the function $F_h(x_{ij}, v_{ij}, (v_{ij} - v_{i'j'})_{x_{i'j'}\in\Gamma(x_{ij})})$ 
is nonincreasing in $v_{i'j'}$, and follows from the fact that the scheme is degenerate elliptic. 
Furthermore, since $0 \le \psi \le 1/\beta$, we get the inequalities 

$$ F_h[x,0] \le F_h[x,v_h] = 0 \le F_h[x,1/\beta], $$

which imply the stability property: $0 \le v_h \le 1/\beta$ by [38, Theorem 5]. The proof of the 
consistency property is based on Taylor's formula: see, e.g., [12] for a simple example. 
We do not go into details here. 

The computer experiments were performed for the following set of parameters: $\beta = 
0.1$, $\sigma = 0.8$, $l = 1$, $\overline y = 40$, $\overline a = -\underline a = 10$. To analyze the influence of the attraction 
(repulsion) rate on optimal strategies we considered the values $k \in [-10,10]$. The 
calculations were implemented on the 200 x 200 grid, covering the rectangle $[-1,1] \times 
[0,40]$. 

Choose $\rho = 1/(2K_h)$ in the iteration method (3.7) and take $\underline v^0 = 0$ and $\overline v^0 = 1/\beta$ as 
initial values. It is easy to see that 

$$ S_\rho(\underline v^0) - \underline v^0 = -\rho F_h[x,\underline v^0] \ge 0, \quad S_\rho(\overline v^0) - \overline v^0 = -\rho F_h[x,\overline v^0] \le 0. $$

From the monotonicity property of the operator $S_\rho$ ([38, Theorem 6]) it follows that 
the iterations with these initial values converge monotonically: $\underline v^n \uparrow v_h$, $\overline v^n \downarrow v_h$. The 
iterations were performed until 

$$ \|\overline v^n - \underline v^n\|_\infty < \varepsilon = 0.01. \tag{3.8} $$

Typically this required about 400 thousand steps. 

For k = 2 the graph and the level sets of the value function are presented in Figure 
1. Clearly, v is symmetric with respect to the axis x = 0, and is increasing in y. 

The switching lines of optimal strategies $\alpha^*$, corresponding to optimal values of a 
in (3.3), are shown in Figure 2. The middle area, containing the equilibrium, is the 
no-action region $G_{na}$, where $\alpha^* = 0$. In the complement area we have $\alpha^* = 10$ near 
the upper boundary $x = l$, and $\alpha^* = -10$ near the lower boundary $x = -l$. 
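
The qualitative shape of such strategies (no action in the middle, full intensity toward the center near the boundaries, while fuel remains) can be tried out by direct simulation. The Python sketch below applies an Euler-Maruyama discretization to the controlled system with a fixed threshold policy and estimates the criterion (1.4) by Monte Carlo. The constant threshold `c`, the time step, the truncation horizon `t_max` and the function name are illustrative assumptions; in particular, the policy is not the computed optimal switching line, which depends on y. With the sign convention $dX_t = \dots - \alpha_t\,dt$, a positive intensity pushes X downward.

```python
import numpy as np

def simulate_threshold_policy(x0=0.0, y0=40.0, c=0.5, k=2.0, sigma=0.8, l=1.0,
                              a_bar=10.0, beta=0.1, dt=0.01, t_max=100.0,
                              n_paths=2000, seed=0):
    """Monte Carlo estimate of E[(1 - exp(-beta*theta)) / beta] under a threshold policy."""
    rng = np.random.default_rng(seed)
    x = np.full(n_paths, x0)
    y = np.full(n_paths, y0)
    alive = np.ones(n_paths, dtype=bool)
    theta = np.full(n_paths, t_max)          # exit times, truncated at t_max
    for step in range(int(round(t_max / dt))):
        # Push toward the center outside [-c, c]; do nothing inside.
        a = np.where(x > c, a_bar, np.where(x < -c, -a_bar, 0.0))
        a = np.where(y > 0.0, a, 0.0)                       # no fuel, no control
        a = np.sign(a) * np.minimum(np.abs(a), y / dt)      # never overspend fuel
        dw = rng.normal(0.0, np.sqrt(dt), n_paths)
        x = np.where(alive, x + (-k * x - a) * dt + sigma * dw, x)
        y = np.where(alive, y - np.abs(a) * dt, y)
        exited = alive & (np.abs(x) >= l)
        theta[exited] = (step + 1) * dt
        alive &= ~exited
        if not alive.any():
            break
    return float(np.mean((1.0 - np.exp(-beta * theta)) / beta))

# Hypothetical comparison: some fuel versus no fuel, same random seed.
J_fuel = simulate_threshold_policy(y0=5.0, k=0.0, n_paths=400, t_max=40.0)
J_no_fuel = simulate_threshold_policy(y0=0.0, k=0.0, n_paths=400, t_max=40.0)
```

Such a simulation gives only a lower estimate for the value function, since the threshold policy is generally suboptimal and the exit time is monitored on a discrete time grid.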

The no-action region widens when y decreases. This means that the controller 
becomes less active when the resource Y runs low. A more interesting and unexpected 
effect concerns the "non-monotonic" behavior of the no-action region with respect to 
k. It was detected experimentally that $G_{na}$ becomes wider when k grows from 0 to 3.5. 
Thus, the controller is less involved in the stabilization of the system, which becomes 
more stable itself. But for k > 3.5 we observe the opposite picture: the no-action 
region narrows as k grows further! 

Figure 1. The value function (a) and its level sets (b) for k = 2. 

Figure 2. Optimal control in the stable case. 

Optimal strategies for the unstable case k < 0 are presented in Figure 3. Here the 
no-action regions are much smaller. This is not surprising, since it is more difficult to keep the 
unstable system near the equilibrium point. In contrast to the stable case, here $G_{na}$ 
shrinks monotonically in k. Also, the value function v is smaller. We do not present 
the graph of v, since it looks similar to Figure 1(a). 

Figure 3. Optimal control in the unstable case. 

4. Optimal tracking of a stochastic system 

Consider a random target $X^1$ which should be tracked by the controlled process $X^2$. 
The fluctuations of $X^1$ are described by the equation 

$$ dX_t^1 = \mu(X_t^1)\,dt + \sigma\,dW_t, \quad \sigma > 0, $$

$$ \mu(x_1) = -k\max\{-b, \min\{x_1, b\}\}, \quad b > 0. $$

The case k > 0 (resp., k < 0) corresponds to the stable (resp., unstable) equilibrium 
point 0 of the corresponding deterministic system. The dynamics of the tracking 
process $X^2$, controlled by the "fuel expenditure", is unaffected by noise: 

$$ dX_t^2 = \alpha_t\,dt, \quad dY_t = -|\alpha_t|\,dt, \quad \alpha_t \in [\underline a, \overline a]. $$

We assume that the tracking is stopped if the target is "lost sight of": 

$$ \theta = \inf\{t \ge 0 : |X_t^1 - X_t^2| \ge l\}, \quad l > 0. $$

For the objective functional (1.4) the HJB equation (2.4) takes the form 

$$ \beta v - 1 - \mu(x_1)v_{x_1} - \frac{\sigma^2}{2}v_{x_1x_1} + \min_{a\in[\underline a,\overline a]}\{|a|v_y - av_{x_2}\} = 0, \quad (x,y) \in G \times (0,\infty), $$

where $G = \{x : |x_1 - x_2| < l\}$. The boundary conditions (2.16) take the form 

$$ v = 0 \ \text{ on } \partial G \times [0,\infty); \qquad v = \psi \ \text{ on } \overline G \times \{0\}, $$

where $\psi$ is the solution of the boundary value problem 

$$ \beta\psi - 1 - \mu(x_1)\psi_{x_1} - \frac{\sigma^2}{2}\psi_{x_1x_1} = 0, \quad x_1 \in (x_2 - l, x_2 + l), \tag{4.1} $$

$$ \psi(x_2 - l, x_2) = \psi(x_2 + l, x_2) = 0. \tag{4.2} $$

Let us check Assumptions 1-3. Let $\psi_1, \psi_2 \in C^2(\mathbb R)$ be a fundamental solution system 
of the homogeneous ordinary linear differential equation corresponding to (4.1). Then 

$$ \psi(x_1,x_2) = C_1(x_2)\psi_1(x_1) + C_2(x_2)\psi_2(x_1) + 1/\beta, $$

where $C_1$, $C_2$ are uniquely defined by the boundary conditions (4.2). It follows that 
$\psi \in C^2(\overline G)$ and Assumption 1 is satisfied. Assumption 2 holds true since the condition 
(2.14) is met. To verify Assumption 3 it is enough to show that $\psi$ and its derivatives 
up to second order are uniformly bounded in G. But this property follows from the fact that 
for $|x|$ large enough, $\mu$ is constant, and we have $\psi = \varphi(x_1 - x_2)$, where $\varphi(z)$ is defined 
by 

$$ \beta\varphi - 1 - \mu\varphi_z - \frac{\sigma^2}{2}\varphi_{zz} = 0, \quad z \in (-l,l); \qquad \varphi(-l) = \varphi(l) = 0. $$

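To solve the problem numerically, as in the previous example, we use the monotone 
finite difference scheme: 

$$ \beta v_{ijk} - 1 - \mu^+(x_{ijk})\frac{v_{i+1,j,k} - v_{ijk}}{h_1} + \mu^-(x_{ijk})\frac{v_{ijk} - v_{i-1,j,k}}{h_1} - \sigma^2\frac{v_{i-1,j,k} - 2v_{ijk} + v_{i+1,j,k}}{2h_1^2} $$

$$ + \min_{a\in[\underline a,\overline a]}\left\{|a|\frac{v_{ijk} - v_{i,j,k-1}}{h_3} - a^+\frac{v_{i,j+1,k} - v_{ijk}}{h_2} + a^-\frac{v_{ijk} - v_{i,j-1,k}}{h_2}\right\} = 0, \quad x_{ijk} \in G_h, $$

$$ v_{ijk} - g(x_{ijk}) = 0, \quad x_{ijk} \in \partial G_h. $$

Here $\mu(x_{ijk})$ is the value of $\mu$ at the first coordinate of $x_{ijk}$, and $c^+ = \max\{c,0\}$, $c^- = \max\{-c,0\}$. 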

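The grid $\overline G_h$ is the subset of points $\{x_{ijk} = (ih_1, jh_2, kh_3) \in G(\overline x,\overline y) : (i,j,k) \in \mathbb Z \times 
\mathbb Z \times \mathbb Z_+\}$. The set $G(\overline x,\overline y) = \{|x_1 - x_2| \le l, \ |x_1 + x_2| \le \overline x, \ y \in [0,\overline y]\}$ is cut out 
from $\overline G \times [0,\infty)$. The values $\overline x$, $\overline y$ determine the artificial boundary. As in Section 3, 
by $\partial G_h$ we denote the parabolic boundary of $\overline G_h$, that is, $\partial G_h$ contains all points of 
$\overline G_h \cap \partial G(\overline x,\overline y)$, except for those with the maximal value of the index k. The other points of the 
grid we attribute to $G_h$. 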

The scheme is analyzed along the same lines as in Section 3. The convergence of its 
solution $v_h$ to the value function v follows from Theorem 1 by the Barles-Souganidis 
method. The grid function $v_h$ is obtained by the iterations (3.7). The initial 
approximations $\underline v^0 = 0$, $\overline v^0 = 1/\beta$ and the stopping criterion (3.8) still apply. 

In the experiments we used the following parameters: $\beta = 0.1$, $\sigma = 0.8$, $b = 2.5$, 
$l = 4/\sqrt 2$, $\overline y = 10$, $\overline x = 80$, $\overline a = -\underline a = 1$. The grid contained 2000 x 100 x 50 nodes $x_{ijk}$. 
With $\varepsilon = 0.01$ in (3.8), the iterations typically stopped after 10 thousand steps. 

It is convenient to make the rotation transform 

$$ z_1 = (x_1 + x_2)/\sqrt 2, \quad z_2 = (x_1 - x_2)/\sqrt 2 $$

and present the results in the new variables $(z_1,z_2)$. The switching lines of optimal 
control in the stable (k = 0.3) and unstable (k = -0.3) cases are shown in Figures 4 
and 5 respectively (for y = 1). 
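
The change of variables is an orthogonal map that sends the strip $|x_1 - x_2| < l$ to the horizontal strip $|z_2| < l/\sqrt 2$. A small Python check of both facts follows; the names `R` and `to_z` are illustrative, and the value $l = 4/\sqrt 2$ is the one used in the experiments above.

```python
import numpy as np

# Matrix of the map (x1, x2) -> (z1, z2).
R = np.array([[1.0, 1.0],
              [1.0, -1.0]]) / np.sqrt(2.0)

def to_z(x1, x2):
    """Map a point (x1, x2) to the rotated coordinates (z1, z2)."""
    return R @ np.array([x1, x2])

l = 4.0 / np.sqrt(2.0)
# A point on the boundary line x1 - x2 = l of the strip G.
z_boundary = to_z(3.0, 3.0 - l)
```

Since `R` is symmetric and orthogonal, it is its own inverse, so the same function maps $(z_1,z_2)$ back to $(x_1,x_2)$.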

The domains between solid lines correspond to the no-action sets. The dashed lines 
determine the switching of optimal control for the problem (2.12), (2.13) with infinite 
fuel (here the no-action region is empty). Since $\mu$ is constant for $|x_1| > b$, the switching 
lines stabilize for $|z_1|$ large enough. Moreover, for large $z_1 > 0$ the no-action region is 
located above (resp., below) the line $z_2 = 0$ in the stable (resp., unstable) case. The 
reason is that in the stable case, for $\alpha = 0$, the point $(Z_t^1, Z_t^2) = (X_t^1 + X_0^2, X_t^1 - X_0^2)/\sqrt 2$ 
with large $Z_0^1 > 0$, on average, goes from the upper boundary of the strip $(z_1,z_2) \in 
\mathbb R \times (-l,l)$ to its lower boundary. In the unstable case there is an opposite trend. For 
$z_1 < 0$ the pictures can be recovered by reflection with respect to the origin. 

Figure 4. Optimal control in the stable case, k = 0.3. 

Figure 5. Optimal control in the unstable case, k = -0.3. 

Examples of graphs and level sets of the value functions v in the stable and unstable 
cases are given in Figures 6 and 7. 

For fixed $z_1$, y the value functions attain their maximum in $z_2$ in the no-action regions. 
The global maximum in the stable case is attained at the origin. However, in the 
unstable case, the maximum points of v are located in those parts of the no-action 
sets where $|z_1|$ is large and v is approximately constant in $z_1$. 

Figure 6. The value function (a) and its level set (b) for k = 0.3. 

Figure 7. The value function (a) and its level set (b) for k = -0.3. 